Field Of The Invention

[0001] The invention relates to systems and methods for identifying 
and/or measuring usage of media data gathered at a user location using 
remote decoding and/or pattern matching techniques. 

Background Of The Invention 

[0002] Techniques used to determine the programs or other content to 
which audience members have been exposed are intended to gather such 
data at the audience members' locations. Various systems have been 
proposed for this purpose. In one variant, a stationary device is positioned 
near a television, radio, computer, or the like, in order to monitor media data 
at audience locations. 

[0003] Another variant proposes the use of a portable device to be 
carried about by an audience member in order to gather data regarding the 
programs and other content to which the audience member has been 
exposed. 

[0004] These devices obtain the signals to be monitored either through 
a direct electrical connection, or by means of a sensor such as a microphone, 
light-sensitive device, capacitive pickup or magnetic sensor. Typically the 
device either detects the presence of an ancillary code in the media data or 
else extracts a signature therefrom for pattern matching, and stores the code 
or signature for subsequent processing at a remote location. In order to 
produce audience surveys which are statistically reliable, it is necessary to 
engage a relatively large number of survey participants, so that it is likewise 
necessary to supply a relatively large number of monitoring devices, such as
stationary or portable devices. It is, therefore, desirable to minimize the 
complexity of such devices in order to minimize their cost. 

Summary Of The Invention 

[0005] For this application the following terms and definitions shall 
apply, both for the singular and plural forms of nouns and for all verb tenses: 

[0006] The term "data" as used herein means any indicia, signals, 
marks, symbols, domains, symbol sets, representations and any other 
physical form or forms representing information, whether permanent or 
temporary, whether visible, audible, acoustic, electric, magnetic, 
electromagnetic or otherwise manifested. 

[0007] The term "set" as used herein means any collection of elements, 
things, or data. 

[0008] The term "amplitude" as used herein refers to values of energy, 
power, voltage, current, charge, intensity, size, magnitude, and/or pressure, 
however measured or evaluated, whether on an absolute or relative basis, on 
a discrete or continuous basis, on an instantaneous or accumulated basis, or 
otherwise. 

[0009] The term "media data" as used herein means data which is 
widely accessible, whether over-the-air, or via cable, satellite, network, 
internetwork (including the Internet), distributed on storage media, or 
otherwise, without regard to the form or content thereof, and including but not 
limited to audio data and video data. 

[00010] The terms "coupled", "coupled to" and "coupled with" as used 
herein each means a relationship between or among two or more devices, 
apparatus, files, programs, media, components, networks, systems, 
subsystems and/or means, constituting any one or more of (a) a connection 
whether direct or through one or more other devices, apparatus, files, 
programs, media, components, networks, systems, subsystems or means, (b) 
a communications relationship whether direct or through one or more other
devices, apparatus, files, programs, media, components, networks, systems, 
subsystems, or means, or (c) a functional relationship in which the operation 
of any one or more thereof depends, in whole or in part, on the operation of 
any one or more others thereof. 

[00011] The term "signature" as used herein means a data set derived 
from the content of media data. 

[00012] The terms "communicate" and "communication" as used herein 
include both conveying data from a source to a destination, and delivering 
data to a communications medium, system or link to be conveyed to a 
destination. 

[00013] The term "processor" as used herein means data processing
devices, apparatus, programs, circuits, systems and subsystems, whether 
implemented in hardware, software or both. 

[00014] In accordance with an aspect of the present invention, a method 
is provided for measuring usage of media data received at a user location, the 
media data being reproducible as comprehensible images or comprehensible 
sounds and having ancillary codes in at least some of the media data. The 
method comprises receiving the media data in a monitoring device at the user 
location; forming a data set in the monitoring device from the media data by 
including in the data set, data sufficient to decode the ancillary codes in the 
media data or to form a signature to identify the media data, while excluding 
from the data set, data required either to reproduce the comprehensible 
images or the comprehensible sounds; communicating the data set to a 
processing system located remotely from the user location; and at the 
remotely located processing system, carrying out at least one of (a) detecting 
the ancillary codes based on the data set; and (b) producing a signature 
characterizing the media data based on the data set and matching the 
produced signature with a reference signature associated with identification 
data for the media data.

[00015] In accordance with another aspect of the present invention, a
method is provided for measuring the usage of media data received at a user
location, the media data being reproducible as comprehensible images or 
comprehensible sounds and having ancillary codes in at least some of the 
media data. The method comprises receiving a data set at a processing 
system located remotely from the user location, the data set including data 
sufficient to decode the ancillary codes in the media data or to form a 
signature to identify the media data, while excluding data required either to 
reproduce the comprehensible images or the comprehensible sounds; and at 
the remotely located processing system, carrying out at least one of (a)
detecting the ancillary codes based on the data set; and (b) producing a 
signature characterizing the media data and matching the produced signature 
with a reference signature associated with identification data for the media 
data. 

[00016] In accordance with still another aspect of the present invention, 
a system is provided for measuring usage of media data received at a user 
location, the media data being reproducible as comprehensible images or 
comprehensible sounds and having ancillary codes in at least some of the 
media data. The system comprises means for receiving a data set at a 
processing system located remotely from the user location, the data set 
including data sufficient to decode the ancillary codes in the media data or to 
form a signature characterizing the media data, while excluding data required 
either to reproduce the comprehensible images or the comprehensible 
sounds; and processing means located at the processing system for carrying 
out at least one of (a) detecting the ancillary codes based on the data set; and 
(b) producing a signature characterizing the media data and matching the 
produced signature with a reference signature associated with identification 
data for the media data. 

[00017] In accordance with a further aspect of the present invention, a 
system is provided for measuring usage of media data received at a user 
location, the media data being reproducible as comprehensible images or 
comprehensible sounds and having ancillary codes in at least some of the 
media data, comprising means for receiving the media data at the user
location; means at the user location for forming a data set from the media 
data by including in the data set, data sufficient to decode the ancillary codes 
in the media data or to form a signature to identify the media data, while 
excluding from the data set, data required either to reproduce the 
comprehensible images or the comprehensible sounds; means for 
communicating the data set to a processing system located remotely from the 
user location; and processing means at the processing system for carrying out 
at least one of (a) detecting the ancillary codes based on the data set; and (b) 
producing a signature characterizing the media data based on the data set 
and matching the produced signature with a reference signature associated 
with identification data for the media data. 

[00018] In accordance with a yet still further aspect of the present 
invention, a system is provided for measuring usage of media data received at 
a user location, the media data being reproducible as comprehensible images 
or comprehensible sounds and having ancillary codes in at least some of the 
media data. The system comprises a communications device at a processing 
facility located remotely from a user location, the communications device 
having an input to receive a data set including data sufficient to decode the 
ancillary codes in the media data or to form a signature to identify the media 
data, while excluding data required to either reproduce the comprehensible 
images or the comprehensible sounds; and a processor located at the 
processing facility and coupled with the communications device to receive the 
data set and operative to carry out at least one of (a) detecting the ancillary 
codes based on the data set; and (b) producing a signature characterizing the 
media data based on the data set and matching the produced signature with a 
reference signature associated with identification data for the media data. 

[00019] In accordance with still another aspect of the present invention, 
a system is provided for measuring usage of media data received at a user 
location, the media data being reproducible as comprehensible images or 
comprehensible sounds and having ancillary codes in at least some of the 
media data. The system comprises a monitoring device at the user location 
and having an input to receive the media data; a first processor at the user
location coupled with the monitoring device to receive the media data and
operative to form a data set including data sufficient to decode the ancillary
codes in the media data or to form a signature to identify the media data,
while excluding from the data set, data required either to reproduce the 
comprehensible images or the comprehensible sounds; a first 
communications device coupled with the first processor to receive the data set 
and operative to communicate the data set to a processing system located 
remotely from the user location; a second communications device at the 
processing system coupled with the first communications device to receive 
the data set; and a second processor at the processing system and having an 
input coupled with the second communications device to receive the data set 
received by the second communications device, the second processor being 
operative to carry out at least one of (a) detecting the ancillary codes based 
on the data set; and (b) producing a signature characterizing the media data 
based on the data set and matching the produced signature with a reference 
signature associated with identification data for the media data. 

[00020] In accordance with a further aspect of the present invention, a 
method is provided for measuring usage of media data received at a user 
location. The method comprises receiving media data representing 
information in a monitoring device at the user location; forming a data set in 
the monitoring device representing some, but not all, of the information 
represented by the media data; communicating the data set to a processing 
system located remotely from the user location; and at the processing system, 
carrying out at least one of: (a) detecting an ancillary code for the media data 
based on the data set; and (b) obtaining identification data for the media data 
by producing a signature for the media data based on the data set and 
matching the produced signature with a reference signature associated with 
the identification data. 

[00021] In accordance with another aspect of the present invention, a 
method is provided for measuring usage of media data representing 
information and received at a user location. The method comprises receiving 
a data set at a processing system located remotely from the user location, the
data set representing some, but not all, of the information represented by the 
media data; and at the processing system, carrying out at least one of: (a) 
detecting an ancillary code for the media data based on the data set; and (b) 
obtaining identification data for the media data by producing a signature for 
the media data based on the data set and matching the produced signature 
with a reference signature associated with the identification data. 

[00022] In accordance with a still further aspect of the present invention, 
a system is provided for measuring usage of media data representing 
information received at a user location. The system comprises means for 
receiving a data set at a processing system located remotely from the user 
location, the data set representing some, but not all, of the information 
represented by the media data; and processing means located at the 
processing system, for carrying out at least one of: (a) detecting an ancillary 
code for the media data based on the data set; and (b) obtaining identification 
data for the media data by producing a signature for the media data based on 
the data set and matching the produced signature with a reference signature 
associated with the identification data. 

[00023] In accordance with a yet still further aspect of the present 
invention, a system is provided for measuring usage of media data received at 
a user location. The system comprises means for receiving media data 
representing information at the user location; data set forming means at the 
user location for forming a data set representing some, but not all, of the 
information represented by the media data; means for communicating the 
data set to a processing system located remotely from the user location; and 
processor means at the processing system, for carrying out at least one of:
(a) detecting an ancillary code for the media data based on the data set; and
(b) obtaining identification data for the media data by producing a signature
for the media data based on the data set and matching the produced
signature with a reference signature associated with the identification data.

[00024] In accordance with yet another aspect of the present invention, 
a system is provided for measuring usage of media data representing 
information received at a user location. The system comprises a
communications device at a processing facility located remotely from the user
location having an input to receive a data set representing some, but not all, 
of the information represented by the media data; and a processor located at 
the processing facility and coupled with the communications device to receive 
the data set and operative to carry out at least one of: (a) detecting an 
ancillary code for the media data based on the data set; and (b) obtaining 
identification data for the media data by producing a signature for the media 
data based on the data set and matching the produced signature with a 
reference signature associated with the identification data. 

[00025] In accordance with yet still another aspect of the present 
invention, a system is provided for measuring usage of media data received at 
a user location. The system comprises a monitoring device at the user 
location and having an input to receive media data representing information; a 
first processor at the user location coupled with the monitoring device to 
receive the media data and operative to form a data set representing some, 
but not all, of the information represented by the media data; a first 
communications device coupled with the first processor to receive the data set 
and operative to communicate the data set to a processing system located 
remotely from the user location; a second communications device at the 
processing system coupled with the first communications device to receive 
the data set; and a second processor at the processing system and having an 
input coupled with the second communications device to receive the data set 
received by the second communications device, the second processor being 
operative to carry out at least one of: (a) detecting an ancillary code for the 
media data based on the data set; and (b) obtaining identification data for the 
media data by producing a signature for the media data based on the data set 
and matching the produced signature with a reference signature associated 
with the identification data. 

Brief Description Of The Drawings 

FIGURE 1 is a block diagram of an advantageous embodiment of the 
invention;

FIGURE 2 is a flowchart for use in describing an operation of the
Figure 1 embodiment;

FIGURE 3 is a flowchart for use in describing an embodiment of the
invention for producing identification data from audio and/or acoustic media 
data; and 

FIGURE 4 is a flowchart for use in describing one alternative for 
implementing the embodiment of Figure 3. 

Detailed Description Of Certain Advantageous Embodiments 

[00026] Figure 1 illustrates an embodiment of a system for measuring 
usage of media data representing information received at a user location. 
The system includes a monitoring device 20 at the user location which 
monitors media data, as indicated by Step 25 in Figure 2. Where acoustic 
data including media data, such as audio data, is monitored, the monitoring 
device 20 typically would be a microphone having an input which receives 
media data in the form of acoustic energy and which serves to transduce the 
acoustic energy to electrical data. Where media data in the form of light 
energy, such as video data, is monitored, the monitoring device 20 takes the 
form of a light-sensitive device, such as a photodiode, or a video camera. 
Light energy including media data could be, for example, light emitted by a 
video display. The device 20 can also take the form of a magnetic pickup for 
sensing magnetic fields associated with a speaker, a capacitive pickup for 
sensing electric fields or an antenna for electromagnetic energy. In still other 
embodiments, the device 20 takes the form of an electrical connection to a 
monitored device, which may be a television, a radio, a cable converter, a 
satellite television system, a game playing system, a VCR, a DVD player, a 
portable player, a computer, a web appliance, or the like. In still further 
embodiments, the monitoring device 20 is embodied in monitoring software 
running on a computer or other reproduction system to gather media data. 

[00027] In certain embodiments, the monitoring device 20 is 
implemented as a stationary monitoring device positioned near a television, 
radio, computer, web appliance, a cable converter, a satellite television 
system, a game playing system, a VCR, a DVD player, or the like. In other
embodiments, the monitoring device 20 is implemented as a portable device 
to be carried about by a user in order to gather data regarding media data to 
which the user is exposed. 

[00028] The monitoring device 20 is coupled with an input of a processor 
30 at the user location, so that the processor 30 can receive the media data 
from the monitoring device. The processor 30 is operative to produce a data 
set representing some, but not all, of the information represented by the 
media data, as indicated by Step 35 of Figure 2. 

[00029] The processor 30 proceeds to form the data set by eliminating 
portions of the media data which are not required for further processing at a 
remote location where either a code (such as an ancillary code and/or 
identification code) is detected from the data set, or a signature is formed for 
matching against a library of signatures representing known media data, or 
both of these processes are carried out. 

[00030] With reference again to Step 35 of Figure 2, in one 
advantageous embodiment, the processor 30 transforms the received media 
data into frequency-domain data and then selects certain portions of the 
frequency-domain data in order to form the data set. In accordance with 
certain alternatives of this embodiment, the media data is transformed into 
frequency-domain data in the form of amplitude data for each of a plurality of 
frequency ranges. Each of these ranges corresponds to a pre-determined 
identification code component and/or ancillary code component which may be 
present in the media data. In certain ones of these embodiments, the 
amplitude data are formed by producing ratios of amplitude data in certain 
frequency ranges to noise levels based on amplitude data outside such 
frequency ranges. In one variant of this technique, the ratios are formed as 
signal-to-noise ratios. 
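
By way of non-limiting illustration only, the ratio-forming step described above might be sketched as follows in Python. The sketch uses a direct discrete Fourier transform rather than an FFT for brevity, and estimates the noise level for each candidate code bin as the mean amplitude of neighboring bins that are not themselves code bins. The function names, the `neighborhood` parameter and the bin numbers are assumptions made for this example and are not taken from the disclosure.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return amplitude values for DFT bins 0..N/2 of a real-valued block."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags


def code_bin_ratios(samples, code_bins, neighborhood=4):
    """Form, for each candidate code bin, the ratio of its amplitude to a
    noise level estimated from nearby bins outside the code bins.

    `neighborhood` (hypothetical parameter) is how many bins on either side
    contribute to the noise estimate.
    """
    mags = dft_magnitudes(samples)
    code_set = set(code_bins)
    ratios = {}
    for k in code_bins:
        lo = max(0, k - neighborhood)
        hi = min(len(mags) - 1, k + neighborhood)
        noise = [mags[j] for j in range(lo, hi + 1) if j not in code_set]
        noise_level = sum(noise) / len(noise) if noise else 0.0
        ratios[k] = mags[k] / noise_level if noise_level > 0 else float("inf")
    return ratios
```

In such a sketch, only the dictionary returned by `code_bin_ratios` would be placed in the data set. The sample block itself is discarded.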

[00031] In still other embodiments, the data set is formed of time-domain 
data. In certain embodiments, the data set is formed by sub-sampling time- 
domain data, or by averaging or combining values of such data over time, or 
by eliminating time segments of the data. In other embodiments, the
time-domain data is produced by selecting a portion of such time-domain data from
a frequency range narrower than a frequency range of the media data. In 
some such embodiments, this time-domain data is formed by filtering the 
media data. 
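
For illustration only, the three time-domain reductions mentioned above (sub-sampling, averaging over time, and eliminating time segments) might be sketched as follows. The function and parameter names are assumptions made for this example. A practical device would likely apply a low-pass filter before sub-sampling to limit aliasing, which the sketch omits.

```python
def subsample(samples, factor):
    """Keep every `factor`-th sample and discard the rest."""
    return samples[::factor]


def block_average(samples, block):
    """Replace each run of `block` samples with its mean value.

    A final, shorter run is averaged over the samples it actually contains.
    """
    out = []
    for i in range(0, len(samples), block):
        chunk = samples[i:i + block]
        out.append(sum(chunk) / len(chunk))
    return out


def keep_segments(samples, keep, period):
    """Keep the first `keep` samples of every `period` samples, eliminating
    the remaining time segment of each period."""
    return [s for i, s in enumerate(samples) if i % period < keep]
```

Each function returns fewer values than it receives, so any of them, alone or in combination, yields a data set that represents some, but not all, of the received information.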

[00032] In yet still further embodiments, the data set comprises data 
representing phase information. Alternative techniques for forming such 
phase information include comparing the phases of simultaneously occurring 
components of the media data from different respective frequency ranges or 
bins, or which constitute one or more single-frequency components, or by 
comparing time-displaced media data values or through a combination of 
such techniques. 
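
As one hedged example of the first technique above, the phases of two simultaneously occurring frequency components might be compared as in the following sketch. It evaluates a single DFT bin for each component and wraps the difference into the range from -pi to pi. The function names and the choice of a single-bin DFT are assumptions made for this example.

```python
import cmath
import math


def bin_phase(samples, k):
    """Phase, in radians, of DFT bin `k` of a real-valued sample block."""
    n = len(samples)
    acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
              for t in range(n))
    return cmath.phase(acc)


def phase_difference(samples, k1, k2):
    """Phase of bin `k1` minus phase of bin `k2`, wrapped to [-pi, pi]."""
    d = bin_phase(samples, k1) - bin_phase(samples, k2)
    return math.atan2(math.sin(d), math.cos(d))
```

The data set would then carry only the phase differences for the selected pairs of components, not the samples from which they were computed.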

[00033] A communications device 40 is coupled with the processor 30 to 
receive the data set. The communications device 40 communicates this data 
set via a communication system, link or medium 50 to a remotely located 
processing system comprising a further communications device 70 and a 
remote processor 60, as indicated by Step 45 of Figure 2. In certain 
embodiments, the communications device 40 is a modem or network card 
which transforms the data set into a format appropriate for communication via
a telephone network, a cable television system, a WAN or a wireless
communications system. In embodiments which communicate the data 
wirelessly, the communications device 40 includes an appropriate transmitter, 
such as a cellular telephone transmitter, a wireless Internet transmission unit, 
an optical transmitter, an acoustic transmitter or satellite communications 
transmitter. 

[00034] The device 70 is selected as appropriate, to be coupled with the 
device 40 to receive the data set as communicated thereby via the system, 
link or medium 50. The communications device 70 is coupled with remote 
processor 60 to provide the data set thereto for producing identification data, 
as indicated by Step 55 of Figure 2.

[00035] In certain embodiments, the remote processor 60 processes the
data set to detect an identification code for the media data and/or an ancillary 
code therein, based on the data set. In other embodiments the remote 
processor 60 carries out a pattern matching process, by producing a signature 
for the media data based on the data set and matching the produced 
signature with a reference signature which is made available at the remotely 
located processing system. In some embodiments the reference signature is 
obtained from a database maintained at the remotely located processing 
system, while in others the reference signature is obtained from a remote 
source, such as a server which accesses a remotely located database. 

[00036] The reference signature is associated with identification data 
serving to identify the media data from which the reference signature has 
been obtained. Accordingly, once a reliable match of the produced signature 
with a reference signature has been achieved, the identification data 
associated with the reference signature serves to identify the media data 
represented by the received data set. 
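
The matching of a produced signature against reference signatures might, purely as an illustrative sketch, take the following form. Signatures are assumed here to be equal-length sequences of bits compared by Hamming distance, and a match is treated as reliable when the best distance does not exceed a threshold. The signature format, the distance measure, the `max_distance` parameter and the identifiers in the reference table are all assumptions made for this example. The incorporated references describe the techniques actually contemplated.

```python
def match_signature(produced, references, max_distance):
    """Return the identification data of the closest reference signature,
    or None when no reference is within `max_distance`.

    `references` maps identification data to a reference signature.
    """
    best_id = None
    best_dist = float("inf")
    for ident, ref in references.items():
        if len(ref) != len(produced):
            continue  # signatures of different length are not comparable
        dist = sum(a != b for a, b in zip(produced, ref))
        if dist < best_dist:
            best_id, best_dist = ident, dist
    return best_id if best_dist <= max_distance else None
```

Returning `None` when the best candidate is too distant reflects the requirement that identification data be reported only once a reliable match has been achieved.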

[00037] Several advantageous and suitable techniques for detecting 
identification codes in media data are disclosed in U.S. Patent No. 5,764,763 to
James M. Jensen, et al., which is assigned to the assignee of the present
application, and which is incorporated by reference herein. Other appropriate 
decoding techniques are disclosed in U.S. Patent No. 5,579,124 to Aijala, et 
al., U.S. Patent Nos. 5,574,962, 5,581,800 and 5,787,334 to Fardeau, et al.,
U.S. Patent No. 5,450,490 to Jensen, et al., and U.S. Patent Application No. 
09/318,045, in the names of Neuhauser, et al., each of which is assigned to 
the assignee of the present application and all of which are incorporated 
herein by reference. 

[00038] Still other suitable decoders are the subject of PCT Publication 
WO 00/04662 to Srinivasan, U.S. Patent No. 5,319,735 to Preuss, et al., U.S. 
Patent No. 6,175,627 to Petrovich, et al., U.S. Patent No. 5,828,325 to 
Wolosewicz, et al., U.S. Patent No. 6,154,484 to Lee, et al., U.S. Patent No. 
5,945,932 to Smith, et al., PCT Publication WO 99/59275 to Lu, et al., PCT 
Publication WO 98/26529 to Lu, et al., and PCT Publication WO 96/27264 to
Lu, et al., all of which are incorporated herein by reference.

[00039] In certain embodiments, the processor 30 forms the data set of
frequency-domain data and the processor 60 processes the frequency-domain
data in the data set to detect an identification code or an ancillary
code therein. Where the codes have been formed as in the Jensen, et al.
U.S. Patent No. 5,764,763 or U.S. Patent No. 5,450,490, the frequency- 
domain data is processed by processor 60 to detect code components with 
predetermined frequencies. Where the codes have been formed as in the 
Srinivasan PCT Publication WO 00/04662, the processor 60 processes the 
frequency-domain data to detect code components distributed according to a 
frequency-hopping pattern. In certain embodiments, the code components 
comprise pairs of frequency components modified in amplitude to encode 
information, and the processor 60 detects such amplitude modifications. In 
certain other embodiments, the code components comprise pairs of frequency 
components modified in phase to encode information, and the processor 60 
detects such phase modifications. Where the codes have been formed as 
spread spectrum codes, as in the Aijala, et al. U.S. Patent No. 5,579,124 or
the Preuss, et al. U.S. Patent No. 5,319,735, the processor 60 comprises an 
appropriate spread spectrum decoder. 
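
For the case of code components with predetermined frequencies, detection at the processor 60 might be sketched, in a greatly simplified and hypothetical form, as follows. Each code symbol is assumed to be represented by a fixed, non-empty group of frequency bins, and a symbol is declared present only if every one of its bins exceeds a threshold ratio. The symbol table, the threshold and the scoring rule are assumptions made for this example and do not reproduce the decoders of the patents cited above.

```python
def detect_symbol(bin_ratios, symbol_table, threshold):
    """Return the symbol whose bins all exceed `threshold`, preferring the
    symbol with the largest summed ratio, or None if no symbol qualifies.

    `bin_ratios` maps a frequency bin to its amplitude-to-noise ratio as
    received in the data set. `symbol_table` maps each symbol to the
    non-empty tuple of bins that represent it.
    """
    best_symbol = None
    best_score = 0.0
    for symbol, bins in symbol_table.items():
        values = [bin_ratios.get(b, 0.0) for b in bins]
        if min(values) < threshold:
            continue  # at least one component of this symbol is missing
        score = sum(values)
        if score > best_score:
            best_symbol, best_score = symbol, score
    return best_symbol
```

Because the function operates only on per-bin ratios, it can run entirely at the remote location on a data set from which the audible content has been excluded.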

[00040] There are advantageous and suitable techniques for carrying 
out a pattern matching process to identify the media data based on the data 
set. Several such techniques are described below in connection with 
Figure 3. 

[00041] Other suitable techniques for extracting signatures from media 
data and matching these signatures to reference signatures are disclosed in 
U.S. Patent No. 5,612,729 to Ellis, et al. and in U.S. Patent No. 4,739,398 to 
Thomas, et al., each of which is assigned to the assignee of the present
invention and both of which are incorporated herein by reference. 

[00042] Still other suitable techniques are the subject of U.S. Patent No. 
3,919,479 to Moon, et al., U.S. Patent No. 4,697,209 to Kiewit, et al., U.S. 
Patent No. 4,677,466 to Lert, et al., U.S. Patent No. 5,512,933 to Wheatley, et
al., U.S. Patent No. 4,955,070 to Welsh, et al., U.S. Patent No. 4,918,730 to
Schulze, U.S. Patent No. 4,843,562 to Kenyon, et al., U.S. Patent No.
4,450,551 to Kenyon, et al., and U.S. Patent No. 4,230,990 to Lert, et al., all 
of which are incorporated herein by reference. 

[00043] In accordance with certain advantageous embodiments of the 
invention, the monitoring device 20 receives media data reproducible as 
comprehensible images or sounds at a user location, the received media data 
having ancillary codes therein. The processor 30 serves to form the data set 
from the media data by excluding data required either to reproduce 
comprehensible images or comprehensible sounds, while including data 
sufficient to decode identification codes and/or ancillary codes in the media 
data or to form a signature to identify such data. 

[00044] In certain variants of these embodiments, audio or image data 
picked up by the monitoring device 20 is either transformed to the frequency 
domain or received as frequency-domain data. Those portions of the 
frequency-domain data not useful to decode an identification code or an 
ancillary code for audio or image media data or to form a signature to identify 
such data, are eliminated. Preferably, but not exclusively, the codes have 
been added to the audio data in accordance with the inaudible encoding 
techniques of U.S. Patent No. 5,764,763. Since the codes themselves are 
inaudible in the reproduced audio data, audible portions of the audio data may 
be eliminated from the data set without loss of data required to decode the 
codes. It will be appreciated that other kinds of inaudible codes may be 
recovered in this manner. 

[00045] Similarly, where encoded image data is collected by means of 
the monitoring device 20, it is preferable that the codes to be recovered are 
visually imperceptible or minimal. In this manner, the data set may be formed 
to include data necessary to decode the codes, while eliminating data 
required to reproduce a comprehensible image. Suitable image encoding 
techniques for producing encoded images having visually imperceptible or 
minimal encoding artifacts, and decoding the same are the subject of U.S. 
Patent No. 6,122,403 to Rhoads, U.S. Patent No. 6,208,745 to Florencio, et
al., U.S. Patent No. 6,205,249 to Moskowitz, U.S. Patent No. 6,198,832 to 
Maes, et al., U.S. Patent No. 5,737,025 to Dougherty, et al., and U.S. Patent 
No. 5,737,026 to Lu, et al., all of which are incorporated herein by reference. 

[00046] In other variants, time-domain audio or image media data
received by the monitoring device is reduced by eliminating such portions 
which are not useful to decode such an identification code or ancillary code or 
form such a signature. Such data reduction can be achieved, for example, by 
filtering or subsampling, averaging or otherwise combining data, or eliminating
time segments of the data. 

[00047] It is thus possible to vastly reduce the amount of data included
in the data set, which facilitates storage and communication of the data set. It 
also preserves the privacy of audience members in the vicinity of the 
monitoring device 20 by preventing reproduction of comprehensible sounds or 
images. 

[00048] Figure 3 illustrates an advantageous embodiment in which the 
data set produced at the user location is formed so that, if an identification 
code and/or ancillary code is present in the media data, it may be extracted 
from the data set, but that if such a code is not present, the same data set 
may be used to produce a signature for use in a signature matching process. 
In Step 100 of Figure 3, time-domain audio data, such as data obtained from 
the output of a microphone, is transferred to the frequency domain, by Fast 
Fourier Transform ("FFT"), wavelet transform, digital filtering, or other time-to- 
frequency domain transformation. Where the audio data is initially received in 
the form of frequency-domain data, this step is unnecessary. 
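
The transformation of Step 100 may be sketched, under stated assumptions, as 
follows. A direct discrete Fourier transform is used here in place of an FFT 
purely for brevity; it yields the same magnitudes for a single frame. The 
function name `dft_magnitudes` and the eight-sample test tone are 
hypothetical. 

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return magnitudes of DFT bins 0..N/2 for one frame of real samples.

    A direct O(N^2) transform; an FFT would give the same values faster.
    """
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        magnitudes.append(abs(acc))
    return magnitudes


# One full cosine cycle across an 8-sample frame: energy falls in bin 1.
tone = [math.cos(2 * math.pi * t / 8) for t in range(8)]
mags = dft_magnitudes(tone)
```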

[00049] The frequency-domain data is subject to a data extraction 
process in Step 110 to produce a reduced data set, such that data required to 
detect an identification code and/or ancillary code, if present, is included in the 
reduced data set, but that a substantial portion of the audio information is not 
included in the reduced data set. The reduced data set is not merely a 
compressed version of the audio signal, but also excludes data required to 
produce a comprehensible version of the audio signal. Consequently, this 
process not only results in substantial data reduction beyond that which may 
be achieved in signal compression, but also ensures privacy. 

[00050] The reduced data set so produced is communicated from the 
user's location, as indicated by Step 120, to a remotely located processing 
system. The data set is then subjected to a code detection process 130 
carried out by examining the frequency content of the data set. If a code is 
present, as indicated in Step 140, a record of the code is created in Step 150. 
In the alternative, or in addition, the detected code is matched with 
identification data for the media data in a database accessible to the remotely 
located processing system. 

[00051] If a code is not detected, a matching process 160 is carried out. 
In the matching process, a signature is produced based on the data set. 
There are several alternative signature extraction techniques. In one, the 
entire data set is used without modification as a signature. In another, a 
portion of the data set is selected as a signature. In yet another, a signature 
is produced based on the data set by combining or otherwise processing its 
data to produce the signature. In certain ones of such processes, pairs of 
frequency data are selected from the data set and used to form ratios 
representing components of the produced signature, as in the audio signature 
formation technique disclosed in Ellis, et al. U.S. Patent No. 5,612,729, 
incorporated herein by reference. 
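
The ratio-based variant may be illustrated by the following minimal sketch. 
It is not the technique of U.S. Patent No. 5,612,729 itself, whose details 
are not reproduced here; it shows only the general idea of forming signature 
components as ratios of selected pairs of frequency data. The function name 
`ratio_signature`, the choice of pairs and the `floor` guard against division 
by zero are assumptions. 

```python
def ratio_signature(magnitudes, bin_pairs, floor=1e-12):
    """Form one signature component per (numerator, denominator) bin pair.

    floor is an illustrative guard so an empty bin cannot divide by zero.
    """
    return [magnitudes[a] / max(magnitudes[b], floor) for a, b in bin_pairs]


signature = ratio_signature([2.0, 4.0, 8.0, 1.0], [(0, 1), (2, 3)])
```

Because each component is a ratio, a uniform change in overall signal level 
leaves the signature unchanged in this sketch. 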

[00052] The signature so produced is then compared with reference 
signatures stored in a database accessible to the remotely located processing 
system. The matching process may be carried out, for example, in the 
manner disclosed by Ellis, et al. in U.S. Patent No. 5,612,729. Once a reliable 
match is found, a record of the match is created, as indicated in Step 170. 
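
The flow of Steps 130 through 170 at the remotely located processing system 
may be sketched as follows. The callables `detect_code` and `make_signature`, 
the dictionary of reference signatures, and the use of a summed absolute 
difference with a `max_distance` acceptance limit are all assumptions made 
for illustration; the disclosure does not prescribe a particular distance 
measure. 

```python
def process_data_set(data_set, detect_code, make_signature,
                     references, max_distance):
    """Try code detection first; fall back to signature matching.

    Returns ("code", value), ("match", reference_id) or
    ("unidentified", None). All parameter names are hypothetical.
    """
    code = detect_code(data_set)
    if code is not None:
        return ("code", code)

    signature = make_signature(data_set)
    best_id, best_distance = None, float("inf")
    for ref_id, ref_signature in references.items():
        if len(ref_signature) != len(signature):
            continue  # not comparable in this simple sketch
        distance = sum(abs(a - b) for a, b in zip(signature, ref_signature))
        if distance < best_distance:
            best_id, best_distance = ref_id, distance

    if best_id is not None and best_distance <= max_distance:
        return ("match", best_id)
    return ("unidentified", None)


refs = {"A": [1.0, 2.0], "B": [5.0, 5.0]}
result = process_data_set([1.1, 2.0], lambda d: None, list, refs, 0.5)
```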

[00053] There are a number of suitable techniques for producing the 
reduced data set in Step 110. Where the audio signal has been encoded in 
accordance with the Srinivasan PCT Publication WO 00/04662, those 
frequency components which may include the code components are retained, 
while those which will not are substantially excluded. 
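
A minimal sketch of such retention follows. It assumes, for illustration 
only, that the spectrum is held as a list of bin magnitudes and that the 
bins which may carry code components are known in advance as a list of 
indices; the function name `reduce_spectrum` and the example indices are 
hypothetical and do not reflect the bin assignments of any cited encoding. 

```python
def reduce_spectrum(magnitudes, candidate_bins):
    """Keep only the bins that may carry code components.

    Out-of-range indices are ignored; all other bins are discarded.
    """
    return {b: magnitudes[b]
            for b in sorted(set(candidate_bins))
            if 0 <= b < len(magnitudes)}


reduced = reduce_spectrum([0.1, 0.9, 0.2, 0.7, 0.3], [3, 1, 9])
```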

[00054] An advantageous technique for use with audio data encoded as 
in the Jensen, et al. U.S. Patent No. 5,764,763 or U.S. Patent No. 5,450,490 
is described in connection with Figure 4. In the technique of Figure 4, the 
audio data, if not already in the frequency domain, is transformed thereto by 
FFT or another suitable method as indicated in Step 200. 

[00055] Noise amplitudes in the frequency neighborhoods of possible 
code components are estimated in Step 210. This is achieved by examining 
the amplitudes of frequency components in such neighborhoods. For 
example, those components having amplitudes below a threshold, such as an 
average or mean amplitude or a fixed value, are combined and averaged or 
otherwise processed to produce a representative noise amplitude. 
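
The noise estimate of Step 210 may be sketched as follows, taking the 
threshold to be the mean amplitude of the neighborhood. The function name 
`estimate_noise`, the symmetric `half_width` window and the inclusion of the 
candidate bin itself in the neighborhood are assumptions made for 
illustration. 

```python
from statistics import fmean


def estimate_noise(magnitudes, center, half_width):
    """Average the below-mean amplitudes around a candidate code bin.

    The window is clipped at the ends of the spectrum. If no component
    lies below the mean (a flat neighborhood), the mean is returned.
    """
    lo = max(0, center - half_width)
    hi = min(len(magnitudes), center + half_width + 1)
    neighborhood = magnitudes[lo:hi]
    threshold = fmean(neighborhood)
    quiet = [m for m in neighborhood if m < threshold]
    return fmean(quiet) if quiet else threshold


noise = estimate_noise([1.0, 1.0, 10.0, 1.0, 2.0], center=2, half_width=2)
```

In this example the strong component at the center exceeds the mean and is 
therefore left out of the noise average. 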

[00056] Then in Step 220 signal-to-noise ratios are determined for each 
possible code component as the ratio of the data amplitude at its frequency 
to the noise amplitude in its frequency neighborhood. In one embodiment, those 
ratios which exceed an upper threshold are rejected as likely representing 
non-code audio signal components, and those falling below a lower threshold 
are rejected as noise. This process is carried out in Step 230. In an 
alternative embodiment, those ratios which would exceed the upper threshold 
are nevertheless retained when the data set is formed. In still another 
embodiment, all ratios are retained, and Step 230 is omitted. 
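
The three embodiments of Steps 220 and 230 may be sketched with a single 
function whose thresholds are optional. The function name `select_ratios`, 
the threshold values and the small divisor guard are hypothetical. 

```python
def select_ratios(magnitudes, noise_by_bin, lower=None, upper=None):
    """Compute a signal-to-noise ratio per candidate bin and filter it.

    lower and upper set: reject ratios below lower or above upper.
    upper=None: ratios above the upper threshold are retained.
    both None: every ratio is retained (Step 230 omitted).
    """
    retained = {}
    for b, noise in noise_by_bin.items():
        ratio = magnitudes[b] / max(noise, 1e-12)  # guard against zero noise
        if lower is not None and ratio < lower:
            continue  # treated as noise
        if upper is not None and ratio > upper:
            continue  # treated as non-code audio
        retained[b] = ratio
    return retained


mags = [0.0, 4.0, 1.0, 40.0]
noise = {1: 2.0, 2: 2.0, 3: 2.0}
both = select_ratios(mags, noise, lower=1.0, upper=10.0)
lower_only = select_ratios(mags, noise, lower=1.0)
everything = select_ratios(mags, noise)
```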

[00057] The retained ratios are stored in Step 240 until it is appropriate 
to communicate the data set to the remotely located processing system. A 
decision is made to communicate, as indicated in Step 250, when a 
predetermined criterion is fulfilled. For example, where the data is gathered 
with a monitoring device carried by an audience member, the data may be 
communicated while the device is coupled with a base station, as in the 
Brooks, et al. U.S. Patent No. 5,483,276. The decision to communicate the 
data set may instead be determined based on an amount of stored data or on 
the lapse of time or else upon the establishment of a communication path by 
the device for transmitting and/or receiving other data. When the criterion for 
data communication is fulfilled, the stored data set or sets are communicated 
to the remotely located processing system as indicated in Step 260. 
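
The decision of Step 250 may be sketched as a simple predicate over the 
criteria mentioned above. The `DeviceState` fields, the function name 
`should_communicate` and the default limits are hypothetical values chosen 
only to make the example concrete. 

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    docked: bool                  # coupled with a base station
    stored_bytes: int             # amount of stored data
    seconds_since_upload: float   # lapse of time since last communication
    link_open: bool               # a path already established for other data


def should_communicate(state, max_bytes=64_000, max_age_s=86_400.0):
    """Return True when any one of the illustrative criteria is fulfilled."""
    return (state.docked
            or state.stored_bytes >= max_bytes
            or state.seconds_since_upload >= max_age_s
            or state.link_open)


idle = DeviceState(docked=False, stored_bytes=100,
                   seconds_since_upload=60.0, link_open=False)
docked = DeviceState(docked=True, stored_bytes=100,
                     seconds_since_upload=60.0, link_open=False)
```

Here the criteria are treated as alternatives; an implementation could just 
as well rely on a single one of them. 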

[00058] Since it is possible to encode each data symbol with relatively 
few frequency components in this embodiment, there are relatively few ratios 
required in order to decode the symbols at the remotely located processing 
system. This enables the data set to be restricted in size to facilitate its 
storage and transmission. 

[00059] Although the invention has been described with reference to 
certain advantageous embodiments, arrangements of elements or steps, 
features and the like, these are not intended to exhaust or exclude all or any 
possible embodiments, arrangements or features, and indeed other 
modifications and variations will be ascertainable to those of skill in the art. 